Lifelong person re-identification (LReID) is in significant demand for real-world development, as large amounts of ReID data are captured from diverse locations over time and inherently cannot be accessed all at once. A key challenge for LReID is how to incrementally preserve old knowledge while gradually adding new capabilities to the system. Unlike most existing LReID methods, which mainly focus on dealing with catastrophic forgetting, we address a more challenging problem: not only reducing forgetting on old tasks but also improving model performance on both new and old tasks during the lifelong learning process. Inspired by the biological process of human cognition, in which the somatosensory neocortex and the hippocampus work together in memory consolidation, we formulate a model called Knowledge Refreshing and Consolidation (KRC) that achieves both positive forward and backward transfer. More specifically, a knowledge refreshing scheme is combined with the knowledge rehearsal mechanism to enable bi-directional knowledge transfer by introducing a dynamic memory model and an adaptive working model. Moreover, a knowledge consolidation scheme operating on the dual space further improves model stability over the long term. Extensive evaluations show KRC's superiority over state-of-the-art LReID methods on challenging pedestrian benchmarks.
Translated by Google Translate
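Demand estimation plays an important role in dynamic pricing, where the optimal price can be obtained by maximizing revenue based on the demand curve. On online hotel booking platforms, the demand or occupancy of rooms varies across room types and changes over time, so obtaining accurate occupancy estimates is challenging. In this paper, we propose a novel hotel demand function that explicitly models the price elasticity of demand for occupancy prediction, and we design a price elasticity prediction model to learn the dynamic price elasticity coefficient from a variety of influencing factors. Our model consists of carefully designed elasticity learning modules that alleviate the endogeneity problem, and it is trained in a multi-task framework to address data sparsity. We conduct comprehensive experiments on real-world datasets and verify that our method outperforms state-of-the-art baselines for both occupancy prediction and dynamic pricing.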
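As a rough illustration of how a price elasticity coefficient links occupancy prediction to revenue-maximizing pricing, the sketch below assumes a constant-elasticity demand curve capped at full occupancy. The abstract does not state the functional form of the proposed demand function, so the formula, the function names (`occupancy`, `best_price`), and all numbers here are hypothetical.

```python
def occupancy(price, base_price, base_occupancy, elasticity):
    """Constant-elasticity demand (an assumed form), capped at full occupancy."""
    demand = base_occupancy * (price / base_price) ** (-elasticity)
    return min(1.0, demand)


def best_price(candidates, base_price, base_occupancy, elasticity):
    """Return the candidate price with the highest expected revenue per room."""
    return max(
        candidates,
        key=lambda p: p * occupancy(p, base_price, base_occupancy, elasticity),
    )


# Hypothetical inputs: 60% occupancy at a price of 100, elasticity of 1.5.
prices = range(50, 151)
p_star = best_price(prices, base_price=100.0, base_occupancy=0.6, elasticity=1.5)
print(p_star)
```

In this toy setting demand is elastic (elasticity above 1), so revenue rises with price only while the rooms would sell out anyway; the selected price is the highest candidate that still fills the hotel. A learned, time-varying elasticity would replace the fixed `elasticity` argument.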
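We propose united implicit functions (UNIF), a part-based method for human reconstruction and animation that takes raw scans and skeletons as input. Previous part-based methods for human reconstruction rely on ground-truth part labels from SMPL and are therefore limited to minimally clothed humans. In contrast, our method learns to separate parts from body motion rather than from part supervision, and can thus be extended to clothed humans and other articulated objects. Our partition-from-motion is achieved by a bone-centered initialization, a bone limit loss, and a normal loss, which ensure stable part division even when the training poses are limited. We also present a minimal perimeter loss for the SDF to suppress extra surfaces and part overlap. Another core component of our method is an adjacent part seaming algorithm that produces non-rigid deformations to maintain the connection between parts, which significantly relieves part-based artifacts. Under this algorithm, we further propose "Competing Parts", a method that defines blending weights by the position of a point relative to the bones rather than by its absolute position, avoiding the generalization problem of neural implicit functions with linear blend skinning (LBS). We demonstrate the effectiveness of our method through clothed human body reconstruction and animation on the CAPE and ClothSeq datasets.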
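Graph representation learning plays an important role in many graph mining applications, but learning embeddings of large-scale graphs remains a problem. Recent works try to improve scalability through graph summarization: they learn embeddings on a smaller summary graph and then restore the node embeddings of the original graph. However, all existing works rely on heuristic designs and lack theoretical analysis. Unlike existing works, we contribute an in-depth theoretical analysis of three specific embedding learning methods based on an introduced kernel matrix, and we reveal that learning embeddings via graph summarization is actually learning embeddings on an approximate graph constructed by the configuration model. We also analyze the approximation error. To the best of our knowledge, this is the first work to give a theoretical analysis of this approach. Furthermore, our analysis framework explains some existing methods and provides useful insights for future work on this problem.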
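For readers unfamiliar with the configuration model mentioned above, the following sketch computes the commonly used approximation of its expected adjacency, where the expected number of edges between nodes i and j is d_i * d_j / (2m) for degrees d and m edges. It only illustrates the general model; how the paper builds the approximate graph from a summary graph is not described in the abstract, and the function name `expected_adjacency` is hypothetical.

```python
def expected_adjacency(degrees):
    """Approximate expected adjacency under the configuration model.

    Entry (i, j) is d_i * d_j / (2m), where 2m is the sum of all degrees.
    """
    two_m = sum(degrees)
    return [[di * dj / two_m for dj in degrees] for di in degrees]


# Hypothetical degree sequence of a small graph with 4 edges.
degrees = [3, 2, 2, 1]
expected = expected_adjacency(degrees)

# Each row sums back to the node's degree, so degrees are preserved in expectation.
row_sums = [sum(row) for row in expected]
print(row_sums)
```

The resulting matrix depends only on the degree sequence, which is why such an approximate graph can be stored far more compactly than the original adjacency matrix.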
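Cervical glandular cell (GC) detection is a key step in computer-aided diagnosis for cervical adenocarcinoma screening. Accurately identifying GCs in cervical smears, where squamous cells dominate, is challenging. The out-of-distribution (OOD) data that is widespread across the whole smear reduces the reliability of machine learning systems for GC detection. Although state-of-the-art (SOTA) deep learning models can outperform pathologists in preselected regions of interest, the large number of false positive (FP) predictions remains unsolved when facing gigapixel whole slide images (WSIs). This paper proposes PolarNet, a novel method based on morphological prior knowledge of GCs that tries to solve the FP problem through a self-attention mechanism over the eight-neighborhood. It estimates the polar orientation of the GC nucleus. As a plug-in module, PolarNet can guide the deep features and predicted confidence of general object detection models. In experiments, we find that general models based on four different frameworks can reject FPs on a small image set and increase the mean average precision (mAP) by $0.007 \sim 0.015$ on average, with the highest exceeding a recent cervical cell detection model by 0.037. With PolarNet plugged in, the deployed C++ program improves the accuracy of top-20 GC detection on external WSIs by 8.8%, at the cost of 14.4 s of computation time. Code is available at https://github.com/chrisa142857/polarnet-gcdet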
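With the emergence of service robots and surveillance cameras, dynamic face recognition (DFR) in the wild has received much attention in recent years. Face detection and head pose estimation are two important steps in DFR. Very often, the pose is estimated after face detection. However, such sequential computation leads to higher latency. In this paper, we propose a low-latency, lightweight network for simultaneous face detection, landmark localization, and head pose estimation. Inspired by the observation that locating facial landmarks is more challenging for faces at large angles, we propose a pose loss to constrain the learning. In addition, we propose an uncertainty multi-task loss to learn the weights of the individual tasks automatically. Another challenge is that robots often use low-compute units such as ARM-based computing cores, so lightweight networks are often needed in place of heavy ones, which leads to a performance drop, especially for small and hard faces. To address this, we propose online feedback sampling to augment the training samples across different scales, which automatically increases the diversity of the training data. Validation on the commonly used WIDER FACE, AFLW, and AFLW2000 datasets shows that the proposed method achieves state-of-the-art performance under low computational resources. The code and data will be available at https://github.com/lyp-deeplearning/mos-multi-task-face-detect.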
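The idea of learning task weights from uncertainty can be sketched as follows. The abstract does not give the exact loss, so this assumes a widely used simplified homoscedastic-uncertainty weighting in which each task loss L_i is scaled by exp(-s_i) and regularized by s_i, with s_i a learnable log-variance. The function names and the loss values are hypothetical, and the task losses are held fixed here purely for illustration (in real training they change at every step).

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Sum of exp(-s_i) * L_i + s_i over tasks (assumed simplified form)."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def fit_log_vars(task_losses, steps=500, lr=0.1):
    """Gradient descent on the log-variances with the task losses held fixed."""
    log_vars = [0.0] * len(task_losses)
    for _ in range(steps):
        # d/ds [exp(-s) * L + s] = 1 - exp(-s) * L
        grads = [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]
        log_vars = [s - lr * g for s, g in zip(log_vars, grads)]
    return log_vars


# Hypothetical per-task losses, e.g. detection and pose estimation.
losses = [4.0, 0.25]
log_vars = fit_log_vars(losses)
weights = [math.exp(-s) for s in log_vars]
print(weights)
```

Under this form each log-variance settles at log(L_i), so a task with a larger loss receives a smaller weight, which keeps any single task from dominating the combined objective.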
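Co-speech gesture generation aims to synthesize a gesture sequence that not only looks real but also matches the input speech audio. Our method generates the movements of the complete upper body, including the arms, hands, and head. Although recent data-driven methods have achieved great success, challenges remain, such as limited variety, poor fidelity, and a lack of objective metrics. Motivated by the fact that speech cannot fully determine the gesture, we design a method that learns a set of gesture template vectors to model the latent conditions, which relieves the ambiguity. In our method, the template vector determines the general appearance of the generated gesture sequence, while the speech audio drives the subtle movements of the body; both are needed to synthesize a realistic gesture sequence. Because an objective metric for gesture-speech synchronization is intractable, we adopt the lip-sync error as a proxy metric to tune and evaluate the synchronization ability of our model. Extensive experiments show the superiority of our method in both objective and subjective evaluations of fidelity and synchronization.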
Recent CLIP-guided 3D optimization methods, e.g., DreamFields and PureCLIPNeRF, achieve great success in zero-shot text-guided 3D synthesis. However, because they are trained from scratch with random initialization and no prior knowledge, these methods usually fail to generate accurate and faithful 3D structures that conform to the corresponding text. In this paper, we make the first attempt to introduce an explicit 3D shape prior into CLIP-guided 3D optimization methods. Specifically, we first generate a high-quality 3D shape from the input text in the text-to-shape stage as the 3D shape prior. We then use it to initialize a neural radiance field and optimize the field with the full prompt. For text-to-shape generation, we present a simple yet effective approach that directly bridges the text and image modalities with a powerful text-to-image diffusion model. To narrow the style domain gap between images synthesized by the text-to-image model and the shape renderings used to train the image-to-shape generator, we further propose to jointly optimize a learnable text prompt and fine-tune the text-to-image diffusion model for rendering-style image generation. Our method, Dream3D, is capable of generating imaginative 3D content with better visual quality and shape accuracy than state-of-the-art methods.
Three-dimensional (3D) ultrasound imaging has been applied to scoliosis assessment, but the current assessment method uses only the coronal projection image and cannot illustrate the 3D deformity and vertebra rotation. Vertebra detection is essential for revealing 3D spine information, but the detection task is challenging due to complex data and limited annotations. We propose VertMatch, a two-step framework that detects vertebral structures in 3D ultrasound volumes by utilizing unlabeled data in a semi-supervised manner. The first step detects the possible positions of structures on each transverse slice globally, and local patches are then cropped based on the detected positions. The second step determines whether the patches contain real vertebral structures and screens the positions predicted in the first step. VertMatch develops three novel components for semi-supervised learning. For position detection in the first step, (1) an anatomical prior is used to screen the pseudo labels generated by the confidence threshold method, and (2) multi-slice consistency is used to exploit more unlabeled data by inputting multiple adjacent slices. For patch identification in the second step, (3) the categories are rebalanced in each batch to solve the imbalance problem. Experimental results demonstrate that VertMatch can detect vertebrae accurately in ultrasound volumes and outperforms state-of-the-art methods. VertMatch is also validated in a clinical application on forty ultrasound scans, and it can be a promising approach for 3D assessment of scoliosis.
We present ResNeRF, a novel geometry-guided two-stage framework for indoor scene novel view synthesis. Recognizing that good geometry greatly boosts the performance of novel view synthesis, and to avoid the geometry ambiguity issue, we propose to characterize the density distribution of the scene using a base density estimated from the scene geometry and a residual density parameterized by the geometry. In the first stage, we focus on geometry reconstruction based on an SDF representation, which leads to a good geometric surface of the scene and a sharp density. In the second stage, the residual density is learned based on the SDF from the first stage to encode more details about the appearance. In this way, our method can better learn the density distribution with the geometry prior for high-fidelity novel view synthesis while preserving the 3D structures. Experiments on large-scale indoor scenes with many less-observed and textureless areas show that, with a good 3D surface, our method achieves state-of-the-art performance for novel view synthesis.